Properties of Embedding Methods for Similarity Searching in Metric Spaces
Authors
Abstract
Complex data types (such as images, documents, and DNA sequences) are becoming increasingly important in modern database applications. A typical query in many of these applications seeks objects that are similar to some target object, where (dis)similarity is defined by a distance function. Often, the cost of evaluating the distance between two objects is very high. Thus, the number of distance evaluations should be kept to a minimum while (ideally) maintaining the quality of the result. One way to approach this goal is to embed the data objects in a vector space so that the distances between the embedded objects approximate the actual distances. Queries can then be performed (for the most part) on the embedded objects. In this paper, we are especially interested in whether the embedding methods ensure that no relevant objects are left out (i.e., there are no false dismissals and, hence, the correct result is reported). Particular attention is paid to the SparseMap, FastMap, and MetricMap embedding methods. SparseMap is a variant of Lipschitz embeddings, while FastMap and MetricMap are inspired by dimension reduction methods for Euclidean spaces (using KLT or the related PCA and SVD). We show that, in general, none of these embedding methods guarantees that queries on the embedded objects have no false dismissals, and we also identify the limited cases in which the guarantee does hold. Moreover, we describe a variant of SparseMap that allows queries with no false dismissals. In addition, we show that with FastMap and MetricMap, the distances between the embedded objects can be much greater than the actual distances. This makes it impossible (or at least impractical) to modify FastMap and MetricMap to guarantee no false dismissals.
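To make the "no false dismissals" condition concrete, here is a minimal Python sketch of a filter-and-refine range query over an embedding. It is not SparseMap, FastMap, or MetricMap; it assumes the simplest kind of Lipschitz embedding, in which each coordinate is the distance to a single reference object (a "pivot"), and it compares embedded vectors with the Chebyshev (maximum) distance. Under those assumptions the triangle inequality gives |d(a, p) - d(b, p)| <= d(a, b) for every pivot p, so embedded distances never exceed actual distances and the filter step cannot discard a true answer. The function names, the pivot choice, and the use of edit distance as a stand-in for an expensive metric are all illustrative assumptions.

```python
from itertools import product


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a metric standing in for an expensive distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def embed(obj, pivots, dist):
    """Map an object to the vector of its distances to each pivot."""
    return tuple(dist(obj, p) for p in pivots)


def chebyshev(u, v):
    """Maximum coordinate difference; a lower bound on the actual distance here."""
    return max(abs(x - y) for x, y in zip(u, v))


def range_query(query, radius, objects, embedded, pivots, dist):
    """Return (answers, number of candidates that needed a real distance call)."""
    q_vec = embed(query, pivots, dist)
    # Filter: cheap comparison in the embedded space.
    candidates = [o for o, vec in zip(objects, embedded)
                  if chebyshev(q_vec, vec) <= radius]
    # Refine: evaluate the actual distance only for surviving candidates.
    answers = [o for o in candidates if dist(query, o) <= radius]
    return answers, len(candidates)


# Hypothetical data set: all length-4 strings over a DNA alphabet.
objects = ["".join(t) for t in product("ACGT", repeat=4)]
pivots = ["AAAA", "CCCC", "GGGG", "TTTT"]  # arbitrary illustrative pivots
embedded = [embed(o, pivots, edit_distance) for o in objects]

result, n_candidates = range_query("ACGA", 1, objects, embedded,
                                   pivots, edit_distance)
exact = [o for o in objects if edit_distance("ACGA", o) <= 1]
print(result == exact, n_candidates, len(objects))
```

The filter may let through false hits, which the refine step removes, but it never drops a true answer. An embedding whose distances can exceed the actual ones, as the abstract reports for FastMap and MetricMap, would break exactly this lower-bound property.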
Similar resources
New Approaches to Similarity Searching in Metric Spaces
Title of dissertation: NEW APPROACHES TO SIMILARITY SEARCHING IN METRIC SPACES Cengiz Celik, Doctor of Philosophy, 2006 Dissertation directed by: Professor David Mount Department of Computer Science The complex and unstructured nature of many types of data, such as multimedia objects, text documents, protein sequences, requires the use of similarity search techniques for retrieval of informatio...
On M-tree Variants in Metric and Non-metric Spaces
Although many metric access methods (MAMs) have been developed so far to solve the problem of similarity searching, there is still a big need for improving retrieval efficiency. One of the most acceptable MAMs is the M-tree, which meets the essential features important for large, persistent and dynamic databases. M-tree’s retrieval inefficiency is hidden in overlaps of its regions, therefore, its ...
Fixed Point Theorems For Weak Contractions in Dualistic Partial Metric Spaces
In this paper, we describe some topological properties of dualistic partial metric spaces and establish some fixed point theorems for weak contraction mappings of rational type defined on dual partial metric spaces. These results are generalizations of some existing results in the literature. Moreover, we present examples to illustrate our result.
On metric spaces induced by fuzzy metric spaces
For a class of fuzzy metric spaces (in the sense of George and Veeramani) with an H-type t-norm, we present a method to construct a metric on a fuzzy metric space. The induced metric space shares many important properties with the given fuzzy metric space. Specifically, they generate the same topology, and have the same completeness. Our results can give the constructive proofs to some probl...
On Generalized Injective Spaces in Generalized Topologies
In this paper, we first present a new type of the concept of open sets by expressing some properties of arbitrary mappings on a power set. With the generalization of the closure spaces in categorical topology, we introduce the generalized topological spaces and the concept of generalized continuity and become familiar with weak and strong structures for generalized topological spaces. Then, int...
Algebraic distance in algebraic cone metric spaces and its properties
In this paper, we prove some properties of algebraic cone metric spaces and introduce the notion of algebraic distance in an algebraic cone metric space. As an application, we obtain some famous fixed point results in the framework of this algebraic distance.
Journal title:
- IEEE Trans. Pattern Anal. Mach. Intell.
Volume 25, Issue
Pages -
Publication date: 2003